Phonotactic Language Recognition using i-vectors and Phoneme Posteriogram Counts

Authors

Luis Fernando D'Haro

Ondrej Glembek

Oldrich Plchot

Pavel Matejka

Mehdi Soufifar

Ricardo de Córdoba

Jan Cernocký

Abstract

This paper describes a novel approach to phonotactic language identification (LID) in which n-gram counts are obtained from phoneme posteriograms instead of from soft-counts based on phoneme lattices. The high-dimensional vectors of counts are reduced to low-dimensional units, for which we adapted the commonly used term i-vectors. The reduction is based on multinomial subspace modeling and is designed to work in the total-variability space. The proposed technique was tested on the NIST 2009 LRE set, with better results than a system based on soft-counts (Cavg on 30s: 3.15% vs 3.43%), and with very good results when fused with an acoustic i-vector LID system (Cavg on 30s: acoustic 2.4% vs 1.25%). The proposed technique is also compared with another low-dimensional projection system based on PCA. Compared with the original soft-counts, the proposed technique provides better results, reduces the problems caused by sparse counts, and avoids the need for pruning techniques when creating the lattices.
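As a rough illustration of how n-gram counts can be read off a posteriogram, the sketch below computes expected bigram counts by summing the outer products of consecutive frame posteriors. This is a minimal sketch under stated assumptions, not the authors' exact procedure: the function name `expected_bigram_counts`, the frame-level (unsegmented) treatment, and the toy two-phoneme posteriogram are all hypothetical and chosen only for illustration.

```python
import numpy as np


def expected_bigram_counts(posteriogram: np.ndarray) -> np.ndarray:
    """Return a flattened vector of expected bigram counts.

    posteriogram: array of shape (T, N), one row of phoneme posteriors
    per frame. Assumption: consecutive frames are treated as independent,
    so the expected count of bigram (i, j) is sum_t p_t[i] * p_{t+1}[j].
    """
    if posteriogram.ndim != 2 or posteriogram.shape[0] < 2:
        raise ValueError("need a (T, N) posteriogram with T >= 2")
    # Sum over t of outer(p_t, p_{t+1}), written as one matrix product.
    counts = posteriogram[:-1].T @ posteriogram[1:]
    # Flatten the N x N matrix into an N^2-dimensional count vector.
    return counts.ravel()


# Toy posteriogram: 3 frames, 2 hypothetical phonemes.
toy = np.array([[0.9, 0.1],
                [0.2, 0.8],
                [0.5, 0.5]])
counts = expected_bigram_counts(toy)
```

Because each row of the posteriogram sums to one, the resulting counts sum to T - 1, the number of frame transitions. A vector of this kind is what a dimensionality reduction step such as the multinomial subspace model or PCA would then take as input.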

Similar resources

DNN senone MAP multinomial i-vectors for phonotactic language recognition

Deep neural networks have recently shown great promise for language recognition. In particular, the expected counts of clustered context-dependent phone states (senones) can serve as a simple but effective phonotactic system. This paper introduces multinomial i-vectors applied to senone counts and shows that they work better than current PCA approaches. In addition, we show that a new approach ...

iVector Approach to Phonotactic Language Recognition

This paper addresses a novel technique for representation and processing of n-gram counts in phonotactic language recognition (LRE): subspace multinomial modelling represents the vectors of n-gram counts by low dimensional vectors of coordinates in total variability subspace, called iVector. Two techniques for iVector scoring are tested: support vector machines (SVM), and logistic regression (L...

The LF Language Recognition System for NIST LRE 2011

This document presents a description of INESC-ID’s Spoken Language Systems Laboratory (LF) Language Recognition systems submitted to the 2011 NIST Language Recognition evaluation. The LF primary system consists of the fusion of six individual sub-systems: four phonotactic sub-systems and two acoustic based sub-systems. The major differences of the submitted LR system with respect to previous LF...

The LF Language Recognition System for Albayzin 2012 Evaluation

This document presents a description of INESC-ID’s Spoken Language Systems Laboratory (LF) systems submitted to the Albayzin 2012 Language Recognition evaluation. The submitted systems differ in the number of sub-systems selected for fusion and in the back-end configuration. The basic set of sub-systems considered are four conventional phonotactic sub-systems based on n-gram modelling of phoneme s...

Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...


Publication date: 2012
